Holistic System Design for Deterministic Replay.
Deterministic replay systems record and reproduce the execution of a hardware or software system. While it is well known how to replay uniprocessor systems, it is much harder to provide deterministic replay of shared memory multithreaded programs on multiprocessors because shared memory accesses add a high-frequency source of non-determinism. This thesis proposes efficient multiprocessor replay systems: Respec, Chimera, and Rosa.
Respec is an operating-system-based replay system. It is based on the observation that most program executions are data-race-free, and that for such programs it is sufficient to record the program input and the happens-before order of synchronization operations for replay. Respec speculates that a program is data-race-free and supports rollback and recovery from misspeculation. For racy programs, Respec employs a cheap runtime check that compares system call outputs and memory/register states of the recorded and replayed processes at semi-regular intervals.
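Respec's misspeculation check can be illustrated with a small sketch (hypothetical Python, not the actual OS-level implementation): fingerprint the memory/register state at each epoch boundary during recording, then recompute and compare fingerprints during replay, reporting the first diverging epoch, where a real system would roll back and retry.

```python
import hashlib

def state_fingerprint(memory: dict) -> str:
    """Hash a snapshot of the (simulated) memory/register state."""
    return hashlib.sha256(repr(sorted(memory.items())).encode()).hexdigest()

def record_run(program, epochs):
    """Run `program` and log a state fingerprint at each epoch boundary."""
    memory, log = {}, []
    for epoch in range(epochs):
        program(memory, epoch)
        log.append(state_fingerprint(memory))
    return log

def replay_run(program, epochs, log):
    """Replay and compare fingerprints; return the first diverging epoch,
    or None if the replay matched the recording everywhere."""
    memory = {}
    for epoch in range(epochs):
        program(memory, epoch)
        if state_fingerprint(memory) != log[epoch]:
            return epoch  # misspeculation detected: a real system rolls back
    return None
```

A deterministic program replays with no divergence; a run whose state differs (as a data race would cause) is caught at the first mismatching epoch.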
Chimera uses a sound static data race detector to find all potential data races and instruments pairs of potentially racing instructions, transforming an arbitrary program into a data-race-free one. Then, Chimera records only the non-deterministic inputs and the order of synchronization operations for replay. However, existing static data race detectors generate excessive false warnings, leading to high recording overhead. Chimera resolves this problem by employing a combination of profiling, symbolic analysis, and dynamic checks that target the sources of imprecision in the static data race detector.
Rosa is a processor-based, ultra-low-overhead (less than one percent) replay solution that requires very little hardware support: it essentially needs only a log of cache misses to reproduce a multiprocessor execution. Unlike previous hardware-assisted systems, Rosa does not record shared memory dependencies at all. Instead, it infers them offline using a Satisfiability Modulo Theories (SMT) solver. Our offline analysis is capable of inferring interleavings that are legal under the Sequential Consistency (SC) and Total Store Order (TSO) memory models.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/102374/1/dongyoon_1.pd
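Rosa's offline inference can be illustrated with a toy sketch that substitutes brute-force enumeration for the SMT solver: given each thread's writes in program order and the observed final memory state, search for an SC-legal interleaving that reproduces it. All names here are illustrative, not Rosa's actual encoding.

```python
from itertools import combinations

def interleavings(t0, t1):
    """Yield every merge of two threads' op lists that preserves each
    thread's program order (the SC-legal total orders)."""
    n, m = len(t0), len(t1)
    for slots in combinations(range(n + m), n):
        it0, it1, chosen = iter(t0), iter(t1), set(slots)
        yield [next(it0) if k in chosen else next(it1) for k in range(n + m)]

def infer_order(t0, t1, observed):
    """Stand-in for the SMT query: find an interleaving of the writes
    (each op is a (var, value) store) that reproduces `observed` memory."""
    for order in interleavings(t0, t1):
        mem = {}
        for var, val in order:
            mem[var] = val
        if mem == observed:
            return order
    return None
```

An SMT solver answers the same question symbolically, scaling to real logs where enumeration would explode.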
Towards Validating Long-Term User Feedbacks in Interactive Recommendation Systems
Interactive Recommender Systems (IRSs) have attracted a lot of attention, due
to their ability to model interactive processes between users and recommender
systems. Numerous approaches have adopted Reinforcement Learning (RL)
algorithms, as these can directly maximize users' cumulative rewards. In IRS,
researchers commonly utilize publicly available review datasets to compare and
evaluate algorithms. However, user feedback provided in public datasets merely
includes instant responses (e.g., a rating), with no inclusion of delayed
responses (e.g., the dwell time and the lifetime value). Thus, the question
remains whether these review datasets are an appropriate choice to evaluate the
long-term effects of the IRS. In this work, we revisited experiments on IRS
with review datasets and compared RL-based models with a simple reward model
that greedily recommends the item with the highest one-step reward. Following
extensive analysis, we reveal three main findings: First, a simple greedy
reward model consistently outperforms RL-based models in maximizing cumulative
rewards. Second, applying higher weighting to long-term rewards leads to a
degradation of recommendation performance. Third, user feedback has only
marginal long-term effects on the benchmark datasets. Based on our findings, we
conclude that a dataset has to be carefully verified and that a simple greedy
baseline should be included for a proper evaluation of RL-based IRS approaches.
Comment: Accepted to SIGIR'2
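The greedy baseline described above is simple to state: fit a one-step reward model (here, a hypothetical mean-rating estimator) and always recommend the items with the highest predicted immediate reward, with no long-horizon planning.

```python
from collections import defaultdict

def fit_mean_reward(interactions):
    """Fit a one-step reward model from (item, rating) pairs:
    each item's predicted reward is its mean historical rating."""
    totals = defaultdict(lambda: [0.0, 0])
    for item, rating in interactions:
        totals[item][0] += rating
        totals[item][1] += 1
    return {item: s / n for item, (s, n) in totals.items()}

def greedy_recommend(reward, candidates, k=1):
    """Pick the k candidates with the highest predicted one-step reward."""
    return sorted(candidates, key=lambda i: reward.get(i, 0.0), reverse=True)[:k]
```

A baseline this simple is exactly what the paper argues should be compared against RL-based models when validating long-term effects.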
On the Importance of Feature Decorrelation for Unsupervised Representation Learning in Reinforcement Learning
Recently, unsupervised representation learning (URL) has improved the sample
efficiency of Reinforcement Learning (RL) by pretraining a model from a large
unlabeled dataset. The underlying principle of these methods is to learn
temporally predictive representations by predicting future states in the latent
space. However, an important challenge of this approach is the representational
collapse, where the subspace of the latent representations collapses into a
low-dimensional manifold. To address this issue, we propose a novel URL
framework that causally predicts future states while increasing the dimension
of the latent manifold by decorrelating the features in the latent space.
Through extensive empirical studies, we demonstrate that our framework
effectively learns predictive representations without collapse, which
significantly improves the sample efficiency of state-of-the-art URL methods on
the Atari 100k benchmark. The code is available at
https://github.com/dojeon-ai/SimTPR.
Comment: Accepted to ICML 202
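The decorrelation idea can be sketched with a covariance-based penalty (an illustrative loss in the spirit of the abstract, not SimTPR's exact objective): center the latent features across the batch and penalize the off-diagonal entries of their covariance matrix, which pushes features apart and counteracts collapse onto a low-dimensional manifold.

```python
import numpy as np

def decorrelation_loss(z):
    """Penalize off-diagonal entries of the feature covariance matrix.
    z: (batch, dim) latent features. Returns a scalar loss that is zero
    when features are already uncorrelated."""
    z = z - z.mean(axis=0)               # center each feature dimension
    cov = (z.T @ z) / (len(z) - 1)       # (dim, dim) sample covariance
    off_diag = cov - np.diag(np.diag(cov))
    return float((off_diag ** 2).sum() / z.shape[1])
```

In training, this term would be added to the future-state prediction loss so the latent space stays high-dimensional while remaining predictive.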
Improving Developers' Understanding of Regex Denial of Service Tools through Anti-Patterns and Fix Strategies
Regular expressions are used for diverse purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by super-linear worst-case execution time during regex matching. Due to the severity and prevalence of ReDoS, past work proposed automatic tools to detect and fix vulnerable regexes. Although these tools were evaluated in automatic experiments, their usability has not been studied; it was not a focus of prior work. Our insight is that the usability of existing tools to detect and fix regexes will improve if we complement them with anti-patterns and fix strategies for vulnerable regexes.
We developed novel anti-patterns for vulnerable regexes, and a collection of fix strategies to repair them. We derived our anti-patterns and fix strategies from a novel theory of regex infinite ambiguity, a necessary condition for regexes vulnerable to ReDoS, and proved the soundness and completeness of our theory. We evaluated the effectiveness of our anti-patterns, both in an automatic experiment and when applied manually. Then, we evaluated how much our anti-patterns and fix strategies improve developers' understanding of the outcome of detection and fixing tools. Our evaluation found that our anti-patterns were effective over a large dataset of regexes (N=209,188): 100% precision and 99% recall, improving on the state of the art's 50% precision and 87% recall. Our anti-patterns were also more effective than the state of the art when applied manually (N=20): 100% of developers applied them effectively, vs. 50% for the state of the art. Finally, our anti-patterns and fix strategies increased developers' understanding when using automatic tools (N=9): from a median of "Very weakly" to "Strongly" when detecting vulnerabilities, and from a median of "Very weakly" to "Very strongly" when fixing them.
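One classic anti-pattern in this space, a quantified group whose body is itself quantified (e.g. `(a+)+`), can be flagged with a simple textual heuristic. The sketch below is far weaker than the paper's anti-patterns, which reason about the regex's ambiguity; it only illustrates the shape such a check can take.

```python
import re

# Simplified textual check for one well-known ReDoS anti-pattern:
# a quantified group whose body ends in a quantifier, e.g. (a+)+ or (\d*)*.
NESTED_QUANTIFIER = re.compile(r"\((?:[^()\\]|\\.)*[+*]\)[+*]")

def looks_vulnerable(pattern: str) -> bool:
    """Heuristic flag only: real detection tools analyze the regex's
    matching automaton rather than its surface syntax."""
    return bool(NESTED_QUANTIFIER.search(pattern))
```

A corresponding fix strategy would rewrite the nested quantifier into an unambiguous equivalent (e.g. `(a+)+` into `a+`).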
SlAction: Non-intrusive, Lightweight Obstructive Sleep Apnea Detection using Infrared Video
Obstructive sleep apnea (OSA) is a prevalent sleep disorder affecting
approximately one billion people worldwide. The current gold standard for
diagnosing OSA, Polysomnography (PSG), involves an overnight hospital stay with
multiple attached sensors, leading to potential inaccuracies due to the
first-night effect. To address this, we present SlAction, a non-intrusive OSA
detection system for daily sleep environments using infrared videos.
Recognizing that sleep videos exhibit minimal motion, this work investigates
the fundamental question: "Are respiratory events adequately reflected in human
motions during sleep?" Analyzing the largest sleep video dataset of 5,098
hours, we establish correlations between OSA events and human motions during
sleep. Our approach uses a low frame rate (2.5 FPS) with a large window size
(60 seconds) and step (30 seconds) for sliding-window analysis to capture slow and long-term
motions related to OSA. Furthermore, we utilize a lightweight deep neural
network for resource-constrained devices, ensuring all video streams are
processed locally without compromising privacy. Evaluations show that SlAction
achieves an average F1 score of 87.6% in detecting OSA across various
environments. Implementing SlAction on NVIDIA Jetson Nano enables real-time
inference (~3 seconds for a 60-second video clip), highlighting its potential
for early detection and personalized treatment of OSA.
Comment: Accepted to ICCV CVAMD 2023, poster
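SlAction's window parameters translate directly into frame indices; the following sketch (a hypothetical helper, not the authors' code) enumerates 60-second windows advancing by 30 seconds at 2.5 FPS, i.e. 150-frame windows with a 75-frame step.

```python
def sliding_windows(n_frames, fps=2.5, window_s=60, step_s=30):
    """Yield (start, end) frame indices for SlAction-style analysis:
    window_s-second windows advancing step_s seconds at the given FPS."""
    win, step = int(window_s * fps), int(step_s * fps)
    for start in range(0, n_frames - win + 1, step):
        yield start, start + win
```

Adjacent windows overlap by half their length, so a respiratory event near a window boundary is still seen in full by the neighboring window.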
Verifiable Sustainability in Data Centers
Sustainability is crucial for combating climate change and protecting our
planet. While there are various systems that can pose a threat to
sustainability, data centers are particularly significant due to their
substantial energy consumption and environmental impact. Although data centers
are increasingly held accountable for being sustainable, the current practice
of reporting sustainability data is often mired in simple green-washing. To
improve this status quo, users as well as regulators need to verify the data on
the sustainability impact reported by data center operators. To do so, data
centers must have appropriate infrastructures in place that provide the
guarantee that the data on sustainability is collected, stored, aggregated, and
converted to metrics in a secure, unforgeable, and privacy-preserving manner.
Therefore, this paper first introduces the new security challenges related to
such infrastructure, explains how they affect operators and users, and outlines
potential solutions and research directions for addressing these challenges for
data centers and other industry segments.
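One building block for unforgeable sustainability data is a tamper-evident log. The sketch below is illustrative only, not a scheme from the paper: it hash-chains sensor readings so each entry commits to its predecessor, making any later modification of stored readings detectable during verification.

```python
import hashlib, json

def append_reading(chain, reading):
    """Append a sensor reading to a tamper-evident hash chain.
    Each entry's hash covers the reading and the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"reading": reading, "prev": prev}, sort_keys=True)
    chain.append({"reading": reading, "prev": prev,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})
    return chain

def verify(chain):
    """Recompute every hash from the genesis value; any edit breaks the chain."""
    prev = "0" * 64
    for entry in chain:
        body = json.dumps({"reading": entry["reading"], "prev": prev},
                          sort_keys=True)
        if entry["prev"] != prev or \
           hashlib.sha256(body.encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

A production design would add signing, trusted hardware, and privacy-preserving aggregation on top of this integrity layer.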